No Need for a Lexicon? Evaluating the Value of the Pronunciation Lexica in End-to-End Models
Abstract
For decades, context-dependent phonemes have been the dominant sub-word unit for conventional acoustic modeling systems. This status quo has recently been challenged by end-to-end models, which seek to combine acoustic, pronunciation, and language model components into a single neural network. Such systems, which typically predict graphemes or words, simplify the recognition process since they remove the need for a separate expert-curated pronunciation lexicon to map from phoneme-based units to words. However, there has been little previous work comparing phoneme-based versus grapheme-based sub-word units in the end-to-end modeling framework, to determine whether the gains from such approaches are primarily due to the new probabilistic model, or from the joint learning of the various components with grapheme-based units. In this work, we conduct detailed experiments aimed at quantifying the value of phoneme-based pronunciation lexica in the context of end-to-end models. We examine phoneme-based end-to-end models, which are contrasted against grapheme-based ones on a large-vocabulary English voice-search task, where we find that graphemes do indeed outperform phonemes. We also compare grapheme-based and phoneme-based approaches on a multi-dialect English task, which once again confirms the superiority of graphemes, greatly simplifying the system for recognizing multiple dialects.
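To make the contrast between the two sub-word units concrete, the following minimal sketch builds the target sequence for one transcript both ways. It is an illustration only: the toy lexicon, its ARPAbet-style symbols, the `<space>` token, and the function names are assumptions, not details taken from the paper, and the abstract does not say how pronunciation variants or out-of-vocabulary words are handled.

```python
# Hypothetical toy lexicon: each word maps to one or more pronunciations.
# The symbols are ARPAbet-style and chosen only for illustration.
TOY_LEXICON = {
    "voice": [["V", "OY", "S"]],
    "search": [["S", "ER", "CH"]],
    "either": [["IY", "DH", "ER"], ["AY", "DH", "ER"]],
}


def grapheme_targets(transcript):
    """Spell the transcript character by character; no lexicon is needed."""
    return ["<space>" if ch == " " else ch for ch in transcript.lower()]


def phoneme_targets(transcript, lexicon):
    """Look each word up in the lexicon and concatenate its phonemes.

    Assumption: the first listed pronunciation is used, and a word that
    is missing from the lexicon raises KeyError.
    """
    targets = []
    for word in transcript.lower().split():
        if word not in lexicon:
            raise KeyError(f"no pronunciation for {word!r}")
        targets.extend(lexicon[word][0])
    return targets


if __name__ == "__main__":
    print(grapheme_targets("voice search"))
    print(phoneme_targets("voice search", TOY_LEXICON))
```

In this sketch the grapheme targets come directly from the spelling, while the phoneme targets depend on a curated lexicon and fail for any word it does not cover, which is the dependency the abstract says grapheme-based end-to-end models remove.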
Related resources
Computer Assisted Pronunciation Teaching (CAPT) and Pedagogy: Improving EFL learners’ Pronunciation Using Clear Pronunciation 2 Software
This study examined the impact of Clear Pronunciation 2 software on teaching English suprasegmental features, focusing on stress, rhythm and intonation. In particular, the software covers five topics in relation to suprasegmental features including consonant cluster, word stress, connected speech, sentence stress and intonation. Seven Iranian EFL learners participated in this study. The study l...
Experimental detection of vowel pronunciation variants in Amharic
The pronunciation lexicon is a fundamental element in an automatic speech transcription system. It associates each lexical entry (usually a grapheme), with one or more phonemic or phone-like forms, the pronunciation variants. Thorough knowledge of the target language is a priori necessary to establish the pronunciation baseforms and variants. The reliance on human expertise can pose difficultie...
Evaluating the Link between Word Frequencies and Pronunciation Variants: a Cross-lingual Study on Read and Spontaneous Speech
The aim of this contribution is twofold: evaluating the use of pronunciation variants in read and spontaneous speech and studying the link between word frequencies and pronunciation variants. The dependence of pronunciation variants on a given system configuration is also addressed in the first part. For the second aspect of this work different variant types are defined. A cross-lingual study is car...
Morphological approaches for an English pronunciation lexicon
Most pronunciation lexica for speech synthesis in English take no account of morphology. Here we demonstrate the benefits of including a morphological breakdown in the transcription. These include maintaining consistency, developing the symbol set and providing the environmental description for allophones and phonetic variables. Our approach does not use a full morphological generator, but incl...
Models of EFL Learners’ Vocabulary Development: Spreading Activation vs. Hierarchical Network Model
Semantic network approaches view organization or representation of internal lexicon in the form of either spreading or hierarchical system identified, respectively, as Spreading Activation Model (SAM) and Hierarchical Network Model (HNM). However, the validity of either model is amongst the intact issues in the literature which can be studied through basing the instruction compatible wi...
Journal title:
- CoRR
Volume: abs/1712.01864
Pages: -
Publication date: 2017